An Empirical Survey of Data Augmentation for Limited Data Learning in NLP


Abstract

NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings. In this paper, we provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods (including token-level augmentations, sentence-level augmentations, adversarial augmentations, and hidden-space augmentations) and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks. Based on the results, we draw several conclusions to help practitioners choose appropriate augmentations in different settings and discuss the current challenges and future directions for limited data learning in NLP.


Similar resources

Data envelopment analysis in service quality evaluation: an empirical study

Service quality is often conceptualized as the comparison between service expectations and the actual performance perceptions. It enhances customer satisfaction, decreases customer defection, and promotes customer loyalty. Substantial literature has examined the concept of service quality, its dimensions, and measurement methods. We introduce the perceived service quality index (PSQI) as a sing...


Metrics for the detection of changed buildings in 3D old vector maps using ALS data (case study: Isfahan city)

The aim of this research is to evaluate and improve existing metrics for verifying the accuracy of old 3D vector maps using a point cloud obtained from a new laser scan of the city of Isfahan. To that end, a laser-scanner point cloud with a density of approximately three points per square meter was used to identify changed features in the old 3D maps. Our focus in this research is on buildings, as one of the main urban features. ...

An empirical survey of Linked Data conformance

There has been a recent, tangible growth in RDF published on the Web in accordance with the Linked Data principles and best practices, the result of which has been dubbed the “Web of Data”. Linked Data guidelines are designed to facilitate ad hoc re-use and integration of conformant structured data—across the Web—by consumer applications; however, thus far, systems have yet to emerge that convi...


Development and implementation of an optimized control strategy for induction machine in an electric vehicle

In the area of automotive engineering there is a tendency toward greater electrification of the powertrain. In this work, control of an induction machine for electric vehicle applications is investigated. As the operating point of the machine changes, adapting the rotor magnetization current seems useful for increasing the machine's efficiency. In the literature there are many approaches wh...


Semi-supervised Learning Methods for Data Augmentation

The original goal of this project was to investigate the extent to which data augmentation schemes based on semi-supervised learning algorithms can improve classification accuracy in supervised learning problems. The objectives included determining the appropriate algorithms, customising them for the purposes of this project and providing their Matlab implementations. These algorithms were to b...



Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2023

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00542